Exhaustive search of linear information encoding protein-peptide recognition
Authors
Abstract
High-throughput in vitro methods have been extensively applied to identify linear information that encodes peptide recognition. However, these methods are limited in the number of peptides, the sequence variation, and the peptide lengths that can be explored, and they often produce solutions that are not found in the cell. Despite the large number of methods developed to address these issues, the exhaustive search of linear information encoding protein-peptide recognition has so far been physically infeasible. Here, we describe a strategy, called DALEL, for the exhaustive search of linear sequence information encoded in proteins that bind to a common partner. We applied DALEL to explore the binding specificity of SH3 domains in the budding yeast Saccharomyces cerevisiae. Using only the polypeptide sequences of SH3 domain binding proteins, we succeeded in identifying the majority of known SH3 binding sites previously discovered either in vitro or in vivo. Moreover, we discovered a number of sites with both non-canonical sequences and distinct properties that may serve ancillary roles in peptide recognition. We compared DALEL to a variety of state-of-the-art algorithms in the blind identification of known binding sites of the human Grb2 SH3 domain. We also benchmarked DALEL on curated biological motifs derived from the ELM database to evaluate the effect of increasing or decreasing the enrichment of the motifs. Our strategy can be applied in conjunction with experimental data on proteins interacting with a common partner to identify binding sites among them. It can also be applied to any group of proteins of interest to identify enriched linear motifs or to exhaustively explore the space of linear information encoded in a polypeptide sequence. Finally, we have developed a webserver, located at http://michnick.bcm.umontreal.ca/dalel, that offers a user-friendly interface and provides different scenarios for using DALEL.
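To make the idea of exhaustively searching linear sequence information concrete, the following is a minimal Python sketch. It is not the DALEL algorithm, whose details the abstract does not give. It only enumerates every contiguous k-mer in a set of sequences and reports those present in a chosen fraction of them. The function names, the `min_fraction` threshold, and the toy sequences are all hypothetical and exist for illustration only.

```python
from collections import Counter


def kmer_support(sequences, k):
    """Count, for every k-mer, how many sequences contain it at least once."""
    support = Counter()
    for seq in sequences:
        # A set ensures each sequence contributes at most once per k-mer.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        support.update(seen)
    return support


def enriched_kmers(sequences, k, min_fraction=0.5):
    """Return (k-mer, count) pairs found in at least min_fraction of sequences.

    The threshold is an illustrative assumption, not a value from the source.
    """
    support = kmer_support(sequences, k)
    n = len(sequences)
    hits = [(kmer, count) for kmer, count in support.items()
            if count / n >= min_fraction]
    # Most widely shared k-mers first; ties broken alphabetically.
    return sorted(hits, key=lambda kc: (-kc[1], kc[0]))


# Hypothetical toy sequences, three of which share a proline-rich stretch.
toy = ["MAPPLPPRKS", "GGPPLPPKTA", "QQRSTPPLPP", "ACDEFGHIKL"]
print(enriched_kmers(toy, k=4, min_fraction=0.75))
```

A real motif search would also need to handle degenerate positions, variable lengths, and a statistical background model, none of which this sketch attempts.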
Related resources
Evaluation of Cell Penetrating Peptide Delivery System on HPV16E7 Expression in Three Types of Cell Line
Background: The poor permeability of the plasma and nuclear membranes to DNA plasmids are two major barriers for the development of these therapeutic molecules. Therefore, success in gene therapy approaches depends on the development of efficient and safe non-viral delivery systems. Objectives: The aim of this study was to investigate the in vitro delivery of plasmid DNA encoding HPV16 E7 gene...
To Weight or Not to Weight: Source-Normalised LDA for Speaker Recognition Using i-vectors
Source-normalised Linear Discriminant Analysis (SNLDA) was recently introduced to improve speaker recognition using i-vectors extracted from multiple speech sources. SNLDA normalises for the effect of speech source in the calculation of the between-speaker covariance matrix. Source-normalised-and-weighted (SNAW) LDA computes a weighted average of source-normalised covariance matrices to better e...
Efficient Point-to-Subspace Query in ℓ1 with Application to Robust Object Instance Recognition
Motivated by vision tasks such as robust face and object recognition, we consider the following general problem: given a collection of low-dimensional linear subspaces in a high-dimensional ambient (image) space, and a query point (image), efficiently determine the nearest subspace to the query in ℓ1 distance. In contrast to the naive exhaustive search which entails large-scale linear programs, ...
Cluster-preserving embedding of proteins by
Similarity searching in protein sequence databases is a standard technique for biologists dealing with a newly sequenced protein. Exhaustive search in such databases is prohibitive because of the large sizes of these databases and because pairwise comparisons are slow. Heuristic techniques, such as FASTA and BLAST, are useful because they are fast and accurate, though it has been shown that exha...
ASR Word Lattice Translation with Exhaustive Reordering is Possible
This paper shows how ASR word lattices can be translated even when exhaustive reordering is required for good translation quality. We propose a method for labeling lattice word hypotheses with position information derived from a confusion network (CN). This information is effectively used in the statistical phrase-based machine translation (MT) search to reduce its complexity, which makes even ...
Journal title:
Volume 13, Issue
Pages -
Publication date: 2017